Optimizing the CVaR via Sampling
Authors
Abstract
Conditional Value at Risk (CVaR) is a prominent risk measure that is used extensively in various domains. We develop a new formula for the gradient of the CVaR in the form of a conditional expectation. Based on this formula, we propose a novel sampling-based estimator for the gradient of the CVaR, in the spirit of the likelihood-ratio method. We analyze the bias of the estimator, and prove the convergence of a corresponding stochastic gradient descent algorithm to a local CVaR optimum. Our method allows CVaR optimization to be considered in new domains. As an example, we consider a reinforcement learning application, and learn a risk-sensitive controller for the game of Tetris.
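The abstract does not reproduce the gradient formula, so the following is only a minimal sketch of what a likelihood-ratio style CVaR gradient estimator could look like, not the authors' implementation. It assumes the lower-tail convention, in which CVaR is the mean reward over the worst alpha-fraction of outcomes, and the conditional-expectation form E[score * (R - VaR) | R <= VaR], where score is the gradient of the log-density of the sample with respect to the parameter. The function name `cvar_gradient_estimate`, its arguments, and the Gaussian toy problem are all hypothetical choices made for illustration.

```python
import numpy as np

def cvar_gradient_estimate(rewards, scores, alpha):
    """Sampling-based estimate of the gradient of the lower-tail CVaR.

    rewards: shape (n,), sampled rewards R_i.
    scores:  shape (n, d), score-function values grad_theta log f(x_i; theta).
    alpha:   tail probability in (0, 1).
    Assumed estimator form (not taken from the source text):
        (1 / (alpha * n)) * sum_i score_i * (R_i - VaR_hat) * 1{R_i <= VaR_hat}
    """
    rewards = np.asarray(rewards, dtype=float)
    scores = np.asarray(scores, dtype=float)
    n = rewards.shape[0]
    var_hat = np.quantile(rewards, alpha)      # empirical alpha-quantile (VaR)
    in_tail = rewards <= var_hat               # samples in the worst alpha-tail
    weights = (rewards - var_hat) * in_tail    # zero outside the tail
    return (scores * weights[:, None]).sum(axis=0) / (alpha * n)

# Toy check: X ~ N(theta, 1) and reward R = X. Shifting theta shifts the whole
# distribution, so the true derivative of CVaR with respect to theta is 1.
rng = np.random.default_rng(0)
theta = 0.5
x = rng.normal(loc=theta, scale=1.0, size=200_000)
scores = (x - theta)[:, None]                  # d/dtheta log N(x; theta, 1)
grad = cvar_gradient_estimate(x, scores, alpha=0.05)
print(grad)
```

In this toy problem the estimate should land close to 1. Because the VaR is itself estimated from the same samples, an estimator of this kind is biased at finite sample sizes, which is consistent with the abstract's statement that the bias is analyzed. A stochastic gradient step would then move theta along `grad`.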
Similar resources
Model Selection and Adaptive Markov chain Monte Carlo for Bayesian Cointegrated VAR model
In this paper, we develop novel Markov chain Monte Carlo sampling methodology for Bayesian Cointegrated Vector Auto Regression (CVAR) models. Here we focus on two novel extensions to the sampling methodology for the CVAR posterior distribution. The first extension we develop replaces the popular sampling methodology of the griddy Gibbs sampler with an automated alternative which is based on an ...
Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures
In this paper, we consider a finite-horizon Markov decision process (MDP) for which the objective at each stage is to minimize a quantile-based risk measure (QBRM) of the sequence of future costs; we call the overall objective a dynamic quantile-based risk measure (DQBRM). In particular, we consider optimizing dynamic risk measures where the one-step risk measures are QBRMs, a class of risk mea...
Computation of VaR and CVaR using stochastic approximations and unconstrained importance sampling
Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR) are two risk measures which are widely used in the practice of risk management. This paper deals with the problem of computing both VaR and CVaR using stochastic approximation (with decreasing steps): we propose a first Robbins-Monro procedure based on Rockafellar-Uryasev’s identity for the CVaR. The convergence rate of this algorithm to ...
Policy Gradients for CVaR-Constrained MDPs
We study a risk-constrained version of the stochastic shortest path (SSP) problem, where the risk measure considered is Conditional Value-at-Risk (CVaR). We propose two algorithms that obtain a locally risk-optimal policy by employing four tools: stochastic approximation, mini batches, policy gradients and importance sampling. Both the algorithms incorporate a CVaR estimation procedure, along t...
Computing VaR and CVaR using stochastic approximation and adaptive unconstrained importance sampling
Value-at-Risk (VaR) and Conditional-Value-at-Risk (CVaR) are two risk measures which are widely used in the practice of risk management. This paper deals with the problem of estimating both VaR and CVaR using stochastic approximation (with decreasing steps): we propose a first Robbins-Monro (RM) procedure based on Rockafellar-Uryasev’s identity for the CVaR. Convergence rate of this algorithm t...